Skip to content

New Catalog Configuration via CREATE SERVER#242

Merged
sfc-gh-npuka merged 26 commits into
mainfrom
naisila/catalog_reconfig
Jun 4, 2026
Merged

New Catalog Configuration via CREATE SERVER#242
sfc-gh-npuka merged 26 commits into
mainfrom
naisila/catalog_reconfig

Conversation

@sfc-gh-npuka

@sfc-gh-npuka sfc-gh-npuka commented Mar 2, 2026

Copy link
Copy Markdown
Collaborator

Note to reviewer: Let's assume we can save client secrets as server options for now. The credentials are part of USER MAPPING in this PR #255

▶️ How users can create REST catalogs

CREATE SERVER my_polaris TYPE 'rest'
  FOREIGN DATA WRAPPER iceberg_catalog
  OPTIONS (rest_endpoint 'http://polaris:8181',
           client_id '...', client_secret '...',
           location_prefix 's3://my-bucket/warehouse');

CREATE TABLE t (a int) USING iceberg WITH (catalog = 'my_polaris');

A server with no options is also valid — all settings fall back to GUCs. This allows creating a named handle for organizational purposes while relying entirely on the global GUC configuration.

▶️ Accepted Server Options

The iceberg_catalog_validator accepts the following server-level options (all defined in iceberg_catalog_option_descs[]):

Option Description
rest_endpoint REST catalog base URL (required; non-empty, must have URI scheme)
rest_auth_type 'oauth2' (default), 'default' (alias for oauth2), or 'horizon'
oauth_endpoint Custom OAuth token endpoint
scope OAuth scope (default from GUC: PRINCIPAL_ROLE:ALL)
enable_vended_credentials Boolean, default true
location_prefix S3/storage prefix, overrides the GUC default
catalog_name Catalog name passed to REST API (affects read-only table CREATE TABLE; ignored at runtime for writable tables)
client_id OAuth client ID
client_secret OAuth client secret

Credentials (client_id, client_secret) are currently stored as server options. USER MAPPING support is in a follow-up PR.

▶️ catalog_name Server Option

The catalog_name option on an iceberg_catalog server specifies the catalog prefix used in REST API URLs (opts->catalogName).

At CREATE TABLE time (read-only tables): if the table does not specify catalog_name, it inherits the server's catalog_name option (via ResolveRestCatalogOptions) or defaults to get_database_name(MyDatabaseId). The resolved value is baked into the table's ftoptions.

At runtime (GetRestCatalogName):

  • Writable tables always return get_database_name(MyDatabaseId). Server and table catalog_name options are ignored. This prevents a subsequent ALTER SERVER ... ADD/SET catalog_name from silently re-routing an existing writable table.
  • Read-only tables return the table-level catalog_name from ftoptions. If missing, error (should never happen after CREATE TABLE inheritance).

Writable tables cannot be created on a server with catalog_name set — enforced at CREATE TABLE time.

▶️ Extension Upgrade Script

The upgrade script creates:

  1. iceberg_catalog FDW — a handler-less FDW with a validator function (iceberg_catalog_validator) that accepts server-level options.
  2. GRANT USAGE ON FOREIGN DATA WRAPPER iceberg_catalog TO lake_write — so non-superusers with lake_write can create catalog servers.
  3. Three built-in catalog serverspg_lake_postgres_catalog (TYPE 'postgres'), pg_lake_object_store_catalog (TYPE 'object_store'), and pg_lake_rest_catalog (TYPE 'rest'). These are extension-owned structural anchors backed by actual pg_foreign_server rows, but carry no options — all configuration comes from GUCs.
  4. GRANT USAGE ON FOREIGN SERVER ... TO lake_write — for each built-in server, so non-superusers with lake_write can create tables against the built-in catalogs.

The reserved user-facing catalog names postgres, object_store, and rest are mapped internally to these built-in server names via ResolveCatalogServerName. A pre-flight check in the upgrade script prevents conflicts if a user already has a server with one of the built-in long names.

▶️ DDL Protection (ValidateIcebergCatalogServerDDL)

A single ProcessUtility hook validates all DDL on iceberg_catalog servers, the iceberg_catalog FDW itself, and extension-membership changes that affect any of these objects. The handler is registered in pg_lake_iceberg/src/init.c.

Two layers of protection for server DDL:

  1. Short reserved names ('postgres', 'object_store', 'rest') — the user-facing catalog= values. CREATE SERVER and RENAME TO these names are blocked so users can't shadow the built-in catalogs.
  2. Built-in long server names (pg_lake_postgres_catalog, etc.) — the pre-created anchors. Outside of CREATE/ALTER EXTENSION, CREATE/ALTER/RENAME/OWNER on them is blocked entirely so they remain pure structural anchors with all configuration in GUCs. DROP is blocked by PostgreSQL core via extension membership (pg_depend deptype 'e').
Operation Effect
CREATE SERVER with reserved short name (postgres, object_store, rest, case-insensitive) Blocked
CREATE SERVER with built-in long name (pg_lake_rest_catalog, etc.) Blocked
CREATE SERVER with TYPE 'postgres' or 'object_store' Blocked
CREATE SERVER without TYPE 'rest' (NULL or non-rest) Blocked
CREATE SERVER with TYPE 'rest' and a non-reserved name Allowed
ALTER SERVER on a built-in server (any change) Blocked
ALTER SERVER ... RENAME TO any reserved or built-in name Blocked
ALTER SERVER ... RENAME TO any name on an iceberg_catalog server Blocked (dependent tables store catalog='<name>' in ftoptions)
ALTER SERVER ... OWNER TO on a built-in server Blocked
ALTER SERVER OPTIONS (SET/ADD rest_endpoint ...) when dependent REST iceberg tables exist Blocked (writable and read-only, via ServerHasDependentRestIcebergTable)
DROP SERVER on a built-in server Blocked by PostgreSQL core (extension-owned)
ALTER EXTENSION pg_lake_iceberg DROP SERVER <built-in> Blocked (defense-in-depth; requires superuser)
ALTER EXTENSION pg_lake_iceberg DROP FOREIGN DATA WRAPPER iceberg_catalog Blocked (defense-in-depth; requires superuser)
ALTER FOREIGN DATA WRAPPER iceberg_catalog (any change) Blocked (defense-in-depth; requires superuser)
CREATE FOREIGN TABLE on any iceberg_catalog server Blocked (FDW has no handler; hints to use CREATE TABLE ... USING iceberg)
All other DDL on user-created servers Allowed

All protection checks are skipped during CREATE EXTENSION / ALTER EXTENSION UPDATE. The ALTER EXTENSION DROP and ALTER FOREIGN DATA WRAPPER guards are defense-in-depth against superuser misuse: both require superuser privilege, but detaching the extension edge or swapping the validator would silently weaken the protection model.

▶️ Single-Catalog Transaction Constraint

Modifying tables from different REST catalogs in the same transaction is rejected at statement time — before any Parquet data is written to S3. The first mutation binds the transaction to its REST catalog; any subsequent mutation targeting a different catalog raises ERRCODE_FEATURE_NOT_SUPPORTED immediately.

This applies to all mutation entry points: INSERT, UPDATE, DELETE, and TRUNCATE, as well as DDL paths (CREATE TABLE / DROP TABLE) which reach the same protection indirectly.

▶️ Backward Compatibility

  • CREATE TABLE ... WITH (catalog = 'rest') continues to work. The catalog option value 'rest' is mapped internally to the built-in pg_lake_rest_catalog server, whose configuration comes entirely from GUCs (case-insensitive: 'REST', 'rEst', etc. all resolve to the same built-in server).
  • All existing GUCs remain functional and serve as fallback defaults for both built-in and user-created catalogs.
  • No changes to postgres or object_store catalog behavior.
  • A pre-existing CREATE SERVER postgres FOREIGN DATA WRAPPER postgres_fdw does not conflict with pg_lake's built-in postgres catalog, since the user-facing short name 'postgres' maps internally to pg_lake_postgres_catalog.

Implementation Internals

▶️ Connection Resolution

ResolveRestCatalogOptions(catalog) is the single entry point for building a RestCatalogOptions. GetRestCatalogOptionsForRelation(relationId) reads the table's catalog option and delegates to it.

All catalogs — built-in and user-created — go through the same resolution path:

  1. ResolveRestCatalogOptions(catalog) — calls ResolveCatalogServerName(catalog) to map short reserved names ('rest'pg_lake_rest_catalog, 'postgres'pg_lake_postgres_catalog, 'object_store'pg_lake_object_store_catalog); user-created server names pass through unchanged. Then delegates to BuildRestCatalogOptionsFromServer(serverName, catalog).
  2. BuildRestCatalogOptionsFromServer(serverName, userVisibleCatalog) — looks up the server in pg_foreign_server, initializes all fields from GUC defaults via ApplyGUCDefaults, then overrides with any options explicitly set on the server via ApplyServerOptionOverrides (driven by iceberg_catalog_option_descs[]). The userVisibleCatalog (the short name the user typed) is stored in opts->catalog so error messages, the token cache key, and the cross-catalog DML check stay in user-facing terms.
  3. ValidateRestCatalogOptions — errors if rest_endpoint is still unset after both sources.

Since ALTER SERVER OPTIONS is blocked on the built-in servers, their option set is always empty and the GUC defaults survive untouched — achieving the "GUCs-only built-in REST" behavior through a single code path.

catalog_name on the server is stored in opts->catalogName when present, but it is not defaulted to get_database_name() inside resolution. For read-only tables, the database-name default is applied at CREATE TABLE time: if the table does not specify catalog_name, create_table.c inherits the server's catalog_name option (if set) or defaults to get_database_name(MyDatabaseId), then bakes the result into the table's ftoptions. Writable tables ignore server catalog_name at runtime (see above).

▶️ Option Validation

Server options are defined once in iceberg_catalog_option_descs[] — the single source of truth for the whitelist (FindCatalogOptionDesc), the user-facing hint (GetValidCatalogOptionsHint), and the option-to-struct applier (ApplyCatalogOptionValue). Adding or removing an option requires changing only this table.

FDW option names and auth type values are compared case-insensitively (pg_strcasecmp). Server names remain case-sensitive (PostgreSQL's GetForeignServerByName is case-sensitive), but the reserved built-in names are checked case-insensitively.

At CREATE/ALTER SERVER time, ValidateCatalogOptionValue enforces:

  • rest_endpoint, oauth_endpoint, location_prefix: non-empty and must contain :// (URI scheme required)
  • catalog_name, client_id, client_secret, scope: non-empty when present
  • rest_auth_type: must be 'oauth2', 'default' (alias for oauth2), or 'horizon'
  • enable_vended_credentials: valid boolean

The validator hint string is built lazily from iceberg_catalog_option_descs[] and cached in TopMemoryContext so it survives transaction aborts on the error path (a per-statement context would leave a dangling pointer on the second invalid-option failure in the same backend).

▶️ Per-Catalog Token Cache

The old code used a single global token (RestCatalogAccessToken). With multiple catalogs potentially using different credentials, the token cache is now a hash table (RestCatalogTokenCache) keyed by the iceberg_catalog server OID (opts->serverOid). Each catalog gets its own cached OAuth token with independent expiry tracking.

The hash table and token strings are allocated in a dedicated RestTokenCacheCtx memory context (under CacheMemoryContext), keeping the cache memory isolated.

Invalidation on server DDL: a syscache invalidation callback (InvalidateRestTokenCache) is registered once per backend via CacheRegisterSyscacheCallback(FOREIGNSERVEROID, ...). Any ALTER SERVER or DROP SERVER resets the entire token cache (MemoryContextReset(RestTokenCacheCtx)) so stale credentials are never reused in open sessions. The cache is rebuilt lazily on the next token lookup. TokenCacheCallbackRegistered ensures the callback is registered exactly once (the syscache callback array is fixed-size).

SendRequestToRestCatalog implements its own retry loop (up to 3 retries) with a ClassifyRestCatalogRequestRetry classifier that determines whether to back off, refresh auth, or stop. When a 419 (token expired) response is received, the retry logic force-refreshes the token via GetRestCatalogAccessToken(opts, true) and patches the Authorization header in-place via UpdateAuthorizationHeader. The opts parameter is passed directly to SendRequestToRestCatalog, avoiding any global state. Pass opts = NULL for the token-fetch request itself to avoid recursion.

▶️ Single-Catalog Transaction Constraint (Internals)

BindRelationToXactRestCatalog(relationId) is called from every entry point that can mutate REST-backed iceberg tables:

  • postgresBeginForeignModify() — row-by-row DML
  • AddQueryResultToTable() — INSERT...SELECT and COPY...FROM pushdown
  • postgresExecForeignTruncate() — TRUNCATE

All per-transaction REST catalog state is grouped in a single PgLakeXactRestCatalogContext struct (static PgLakeXactRestCatalog), which contains:

  • requestsHash — hash table of per-table REST catalog requests
  • commitContext — pre-allocated 1MB memory context for XACT_COMMIT
  • catalogOpts — deep-copied RestCatalogOptions for the single catalog

On the first REST-backed write of the transaction, BindRelationToXactRestCatalog pre-resolves the relation's full RestCatalogOptions and deep-copies them into TopTransactionContext (PgLakeXactRestCatalog->catalogOpts). This happens before any Parquet data is written to S3. Subsequent writes in the same transaction must target the same catalog (compared by server OID — opts->serverOid — so case variations like 'rest' and 'REST' correctly collapse to the same built-in server); otherwise ERRCODE_FEATURE_NOT_SUPPORTED is raised immediately.

DDL paths (CREATE TABLE / DROP TABLE) reach the same protection indirectly via RecordRestCatalogRequestInTx(), which is invoked synchronously at statement time from the utility hook. That function retains a belt-and-suspenders cross-catalog check for any code path that forgets to bind at statement time.

At XACT_EVENT_COMMIT time, PostAllRestCatalogRequests uses the pre-resolved PgLakeXactRestCatalog->catalogOpts to build URLs and authentication headers. This avoids syscache lookups during commit, which are forbidden (AssertCouldGetRelation fires during TRANS_COMMIT state).

When an iceberg table is created on any catalog server — user-created or built-in — RecordIcebergCatalogServerDependency records a DEPENDENCY_NORMAL edge from the table to the server in pg_depend. For built-in catalogs, the short reserved name (e.g. 'rest') is mapped to the built-in server name (pg_lake_rest_catalog) via ResolveCatalogServerName. This ensures DROP SERVER is blocked while dependent tables exist, and DROP EXTENSION CASCADE cleans up tables that depend on the built-in servers.

▶️ Catalog Type Helpers (pg_lake_engine/src/utils/catalog_type.c)

  • IsRestCatalog(catalog) — unified check that returns true for the literal 'rest' (case-insensitive) or any iceberg_catalog server whose TYPE is 'rest'. Internal built-in server names are explicitly rejected (they are implementation details, not user-facing catalog values). Used by the GUC check hook for default_catalog to accept user-created server names.

  • IsCatalogOwnedByExtension(catalog) — returns true for the short reserved names 'rest', 'object_store', or 'postgres' (all case-insensitive via pg_strcasecmp). Used by the DDL protection handler, dependency recording, and ACL checks.

  • IsBuiltinCatalogServerName(serverName) — returns true for the long built-in server names (pg_lake_rest_catalog, pg_lake_postgres_catalog, pg_lake_object_store_catalog), case-insensitive. The long-name counterpart to IsCatalogOwnedByExtension. Used by the DDL protection hook to lock down the extension's structural anchors.

  • ResolveCatalogServerName(catalog) — maps user-facing short names to the corresponding built-in server name; passes user-created server names through unchanged.

Fixes part of #230


Checklist

  • Probably need to intercept create server commands that use variations of case-insensitive "postgres", "rest" and "object_store" reserved literals.
  • I have tested my changes and added tests if necessary
  • I updated documentation if needed
  • I confirm that all my commits are signed off (DCO)

extern PGDLLEXPORT bool HasRestCatalogTableOption(List *options);
extern PGDLLEXPORT bool HasObjectStoreCatalogTableOption(List *options);
extern PGDLLEXPORT bool HasReadOnlyOption(List *options);
extern PGDLLEXPORT bool IsServerBasedRestCatalog(List *options);

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that sounds a bit weird. maybe IsRestCatalogOwnedByExtension and IsRestCatalogOwnedByUsers

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First I renamed to IsRestCatalogOwnedByUsers but then I realized we don't really need the distinction so I changed it to IsRestCatalog. Details in the last commit of this PR

Comment thread pg_lake_engine/src/utils/catalog_type.c Outdated
Comment on lines +131 to +133
if (pg_strncasecmp(catalog, REST_CATALOG_NAME, strlen(REST_CATALOG_NAME)) == 0 ||
pg_strncasecmp(catalog, OBJECT_STORE_CATALOG_NAME, strlen(OBJECT_STORE_CATALOG_NAME)) == 0 ||
pg_strncasecmp(catalog, POSTGRES_CATALOG_NAME, strlen(POSTGRES_CATALOG_NAME)) == 0)

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe IsCatalogOwnedByExtension, that catches all our catalogs

Comment thread pg_lake_engine/src/utils/catalog_type.c Outdated
if (server->servertype != NULL && *server->servertype != '\0')
return pg_strncasecmp(server->servertype, "rest", strlen("rest")) == 0;

/* No TYPE specified, assume rest */

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we disallow that?

@sfc-gh-npuka sfc-gh-npuka Mar 12, 2026

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can allow since 'rest' is the only type that is allowed for users for now

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 for disallowing that, perhaps on create server, and we can assert here that TYPE cannot be NULL or empty.

static char *RestCatalogAccessToken = NULL;
static TimestampTz RestCatalogAccessTokenExpiry = 0;
* Per-server token cache. Keyed by server name (for server-based catalogs)
* or "GUC" (for GUC-based backward-compatible catalog='rest').

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

instead of GUC, cant we still have server name for our catalogs?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

makes sense

conn->host = defGetString(def);
else if (strcmp(def->defname, "client_id") == 0)
conn->clientId = defGetString(def);
else if (strcmp(def->defname, "client_secret") == 0)

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this user mapping or server option?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

User mapping. Here only for the sake of separate server infrastructure.

Comment thread pg_lake_table/src/ddl/create_table.c Outdated
Comment on lines +744 to +747
if (pg_strncasecmp(catalogOptionValue, REST_CATALOG_NAME, strlen(catalogOptionValue)) == 0)
conn = GetRestCatalogConnectionFromGUCs();
else
conn = GetRestCatalogConnectionFromServer(catalogOptionValue);

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a wrapper seems useful

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since both extension-owned and user-created REST type catalog servers fallback to GUC, there is no need for a distinction here with if-else, as I changed in my last commit.

Comment thread pg_lake_table/src/ddl/create_table.c Outdated

metadataLocation =
GetMetadataLocationFromRestCatalog(catalogName, catalogNamespace, catalogTableName);
GetMetadataLocationFromRestCatalog(conn, catalogName, catalogNamespace, catalogTableName);

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we might fetch conn inside GetMetadataLocationFromRestCatalog and do not change its signature maybe.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We would either need the relation ID or the server name to fetch conn from the inside, so signature change is needed.

{
if (!requestPerTable->isValid)
{
/*

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this comment should stay.

requestPerTable->dropTableRequest != NULL)
{
/*
* table is created and dropped in the same transaction, nothing

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same, comment should stay

HttpResult httpResult =
SendRequestToRestCatalog(HTTP_POST, requestPerTable->tableRestUrl,
createTableRequest->body, PostHeadersWithAuth());
createTableRequest->body, PostHeadersWithAuth(requestPerTable->conn));

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we need to add tests where we modify multiple rest tables inside the same transaction to cover the changes.

Comment thread pg_lake_iceberg/src/rest_catalog/rest_catalog.c Outdated
@sfc-gh-npuka sfc-gh-npuka linked an issue Mar 9, 2026 that may be closed by this pull request
8 tasks
@sfc-gh-npuka sfc-gh-npuka mentioned this pull request Mar 9, 2026
8 tasks
@sfc-gh-npuka sfc-gh-npuka force-pushed the naisila/catalog_reconfig branch 4 times, most recently from 14b402b to ca9adb0 Compare March 12, 2026 11:45
@sfc-gh-npuka sfc-gh-npuka force-pushed the naisila/catalog_reconfig branch from ca9adb0 to c613b0f Compare March 12, 2026 16:13

@sfc-gh-abozkurt sfc-gh-abozkurt left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good job! It seems very close to merge to me.

extern PGDLLEXPORT RestCatalogConnectionInfo * GetRestCatalogConnectionForRelation(Oid relationId);

/* FDW name for iceberg_catalog servers */
#define ICEBERG_CATALOG_FDW_NAME "iceberg_catalog"

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this can be a more generic place

ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("invalid rest_auth_type option: \"%s\"", authType),
errhint("Valid values are \"default\" and \"horizon\".")));

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: does it makes sense to name default (oauth2 or such)?

* - ALTER ... RENAME on 'postgres', 'object_store', or 'rest' is blocked.
*/
bool
ProtectExtensionCatalogServersHandler(ProcessUtilityParams *processUtilityParams,

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe BlockDdlOnExtensionCatalogs

}
else if (IsA(parsetree, AlterForeignServerStmt))
{
AlterForeignServerStmt *stmt = (AlterForeignServerStmt *) parsetree;

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a test tho show ALTER SERVER OWNER TO is also disabled?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, I may need to intercept AlterOwnerStmt for that

ForeignServer *server = GetForeignServerByName(serverName, false);
ForeignDataWrapper *fdw = GetForeignDataWrapper(server->fdwid);

if (strcmp(fdw->fdwname, ICEBERG_CATALOG_FDW_NAME) != 0)

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we prevent that at CREATE FOREIGN SERVER time by hooking into it? Here we can have assertion.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch - actually by this point the server is guaranteed to be iceberg_catalog, so just an Assert is sufficient.

if (!RestCatalogClientId || !*RestCatalogClientId)
ereport(ERROR, (errmsg("pg_lake_iceberg.rest_catalog_client_id should be set")));
if (!conn->clientId || !*conn->clientId)
ereport(ERROR, (errmsg("REST catalog client_id is not configured")));

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hint could be alter server option or GUC, same for others for which it applies

Comment on lines +1494 to +1498
* Invalidate all cached tokens so that the next request will fetch a
* fresh one. We clear the entire cache because the retry callback
* does not have access to a specific connection's info. This is safe
* because token expiry is rare and other connections will simply
* re-authenticate on their next request.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldnt we invalidate only one token that belongs to the catalog which we send request to? i.e. HASH_REMOVE

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, we could pass the servername to SendRequestToRestCatalog

Comment thread pg_lake_iceberg/src/init.c Outdated

AvroInit();

RegisterUtilityStatementHandler(ProtectExtensionCatalogServersHandler, NULL);

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hooks should exist in pg_lake_table extension. pg_lake_iceberg is only used for iceberg spec and catalog implementations and iceberg specific internal tables or views.

*/
while (list_length(tablesWithModifications) > 0)
{
RestCatalogRequestPerTable *firstTable =

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just asking. Cant we still send a single request? At the first sight, I see polaris open api seems to not allow it. @sfc-gh-okalaci

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can't send a single HTTP request for multiple hosts. The code currently creates batches based on host.

But, now that you mention it, what if we have two different catalogs, from the same host? This could be a bug. Maybe we should batch on (host, catalog_name) and not simply on (host)

assert result[0]["srvtype"] == "postgres"


def test_precreated_object_store_server(pg_conn, extension):

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

precreated_rest?

@sfc-gh-abozkurt

Copy link
Copy Markdown
Collaborator

I think we should disable table creation with iceberg_catalog fdw as it has no handler via hooks. (we already have create table hooks)

postgres=# create foreign table test (a timestamptz) server postgres;
CREATE FOREIGN TABLE
postgres=# table test;
ERROR:  foreign-data wrapper "iceberg_catalog" has no handler

@sfc-gh-npuka sfc-gh-npuka marked this pull request as ready for review March 13, 2026 09:15
@sfc-gh-npuka sfc-gh-npuka force-pushed the naisila/catalog_reconfig branch from da7a92a to 6802bf9 Compare March 13, 2026 11:32
@sfc-gh-npuka sfc-gh-npuka force-pushed the naisila/catalog_reconfig branch from baf1541 to 2c7435f Compare March 14, 2026 16:33
Comment thread pg_lake_table/src/transaction/track_iceberg_metadata_changes.c
{
for (int i = 0; iceberg_catalog_server_options[i] != NULL; i++)
{
if (strcmp(keyword, iceberg_catalog_server_options[i]) == 0)

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this be case insensitive? It is more common in pg_lake and in Postgres to allow case insensitive keywords.

Make sure it works in other places as well if we change it, for example some tests should use different cases

char *authType = defGetString(def);

if (strcmp(authType, "oauth2") != 0 &&
strcmp(authType, "default") != 0 &&

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

similarly: can we allow case insensitive values? make sure if we allow it here, we check in other places similarly

URLEncodePath(catalogName));

HttpResult httpResult = SendRequestToRestCatalog(HTTP_POST, postUrl, body.data, PostHeadersWithAuth());
HttpResult httpResult = SendRequestToRestCatalog(HTTP_POST, postUrl, body.data,

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I think we have some indentation issues in the PR, such as this one. Can you maybe go over once more regarding that? Do you use make reindent? That seems to fix some of these on my local

const int MAX_HTTP_RETRY_FOR_REST_CATALOG = 3;

return SendHttpRequestWithRetry(method, url, body, headers, ShouldRetryRequestToRestCatalog, MAX_HTTP_RETRY_FOR_REST_CATALOG);
CurrentRetryServerName = serverName;

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

static variables are always painful to maintain, what if SendHttpRequestWithRetry throws an error? We are left with the old server name. I'm pretty sure that cause hard-to-understand issues/consequences.

I think what we are trying here is to pass ShouldRetryRequestToRestCatalog the server name.

Can't we change HttpRetryFn signature to pass the server name?

VALIDATOR lake_iceberg.iceberg_catalog_validator;

/* Pre-created catalog servers for backward compatibility */
CREATE SERVER postgres

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

uf, the name postgres might be used by the users, say for an postgres_fdw server.

I wonder if we should pick something harder to hit? Can you think of anything prefix/postfix for the pre-created servers?


GetRestCatalogAccessToken(forceRefreshToken);
strlcpy(cacheKey, CurrentRetryServerName, TOKEN_CACHE_KEY_LEN);
hash_search(RestCatalogTokenCache, cacheKey, HASH_REMOVE, NULL);

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hm, we remove the token from the cache, and return true, so the caller would simply re-use the existing headers to retry, which means will use the old token.

I think we should re-generate token and add here? Especially given we return true, we should do something like that.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, we can use GetRestCatalogAccessToken like the previous code was using. We wouldn't even need hash_search(..., HASH_REMOVE, ...) since there is existing forceRefreshToken logic.
Note that Claude is suggesting we had a previous bug of using the same pre-built headers after GetRestCatalogAccessToken(forceRefreshToken); So taking care of that as well.

@sfc-gh-npuka sfc-gh-npuka Mar 23, 2026

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Opened an issue for this: #276
Not sure if we should solve it in this PR or separately

EDIT: maybe better to solve this separately as this PR is getting huge #278

"rest_auth_type",
"oauth_endpoint",
"enable_vended_credentials",
"location_prefix",

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we dont seem to be using location_prefix ? I mean, even if I provide that, we still use GetIcebergDefaultLocationPrefix when needed in rest_catalog.c?

I think if we support this, we should have tests such that we set GetIcebergDefaultLocationPrefix to return a location that's not working, create/modify/vacuum/ddl tables such that make sure we don't accidentally use it.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry about that, huge miss from my part.

@sfc-gh-npuka sfc-gh-npuka Mar 22, 2026

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I am reusing the "remove trailing slash" code - think I'll open a separate PR about that since we are also using it for stage_location and other things.

EDIT: opened and merged separate PR on this #275

{
char *newCatalog = *newvalue;

if (pg_strncasecmp(newCatalog, POSTGRES_CATALOG_NAME, strlen(newCatalog)) == 0 ||

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

aren't we supposed to allow use-defined catalogs?

I thought one of the most important pieces of this PR is to let users easily use/switch between the servers, and this seems like a good place to be able to do that?

Needs tests to show we allow user created rest server, but not allow say postgres_fdw servers or unrelated/non-existing ones

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch! I forgot about this part.

if (!IsIcebergCatalogServer(stmt->servername))
return false;

if (pg_strcasecmp(stmt->servername, POSTGRES_CATALOG_NAME) == 0 ||

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think postgres server name is case sensitive? https://github.com/postgres/postgres/blob/4f433025f666fa4a6209f0e847715767fb1c7ace/src/backend/foreign/foreign.c#L794

I think for server names we should use strcmp ?

ps: good idea to test two servers: CREATE SERVER "test" and CREATE SERVER "TEST" in the same db, with different options and make sure they both work fine?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could do that.
In that case, we would allow names like "Object_Store", so functions like HasObjectStoreCatalogTableOption should also be case sensitive.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

on second thought, I think we should keep case insensitive comparisons against POSTGRES_CATALOG_NAME, OBJECT_STORE_CATALOG_NAME and REST_CATALOG_NAME. We can't have "rest" and "REST". But we can have "test" and "TEST".


/*
* Gets an access token from rest catalog. Caches the token per server
* (keyed by host + clientId) until it is about to expire.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this comment keyed by host + clientId) is stale?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

true, it's keyed by server name

requestPerTable = (RestCatalogRequestPerTable *) lfirst(lc);

if (!lastRequest)
if (strcmp(requestPerTable->conn->host, batchHost) != 0 ||

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we'll remove this code, but anyway this is not safe, we can have multiple REST catalogs in the same host? So we cannot really compare the host?

If we need this kind of grouping in any other place in the code, we should do with the OID of the server instead?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is actually checking both host and catalog name which is unique.

if (strcmp(requestPerTable->conn->host, batchHost) != 0 ||
strcmp(requestPerTable->catalogName, catalogName) != 0)

Server name/oid would also work (better since it would be a single check).

Comment thread pg_lake_table/src/init.c Outdated

MarkGUCPrefixReserved(PG_LAKE_TABLE);

RegisterUtilityStatementHandler(BlockDDLOnExtensionCatalogs, NULL);

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we register this in pg_lake_icebeg? Any reason to do it in pg_lake_table?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sfc-gh-npuka and others added 26 commits June 4, 2026 20:52
…pport

Introduce the iceberg_catalog foreign data wrapper as a configuration
framework for Iceberg catalogs. This allows defining named catalog
servers via CREATE SERVER, removing the limitation of a single global
REST catalog configured through GUCs.

Key changes:
- Add iceberg_catalog_validator for server option validation
- Add RestCatalogConnectionInfo struct to unify connection parameters
- Add GetRestCatalogConnectionFromGUCs/FromServer resolution functions
- Refactor all REST catalog functions to accept RestCatalogConnectionInfo
- Refactor token cache from single global to per-server hash table
- Add extension upgrade SQL (3.2--3.3) with FDW, validator, and
  pre-created postgres/object_store servers

Signed-off-by: sfc-gh-npuka <naisila.puka@snowflake.com>
Update IsServerBasedRestCatalog and HasRestCatalogTableOption to check
whether a catalog option value refers to an iceberg_catalog foreign
server. Servers without an explicit TYPE default to rest.

Signed-off-by: sfc-gh-npuka <naisila.puka@snowflake.com>
Update pg_lake_iceberg_validator to accept server names as valid catalog
option values (in addition to the existing postgres/object_store/rest
literals). Update ProcessCreateIcebergTableFromForeignTableStmt to
resolve REST catalog connection info from the named server when the
catalog option refers to an iceberg_catalog server.

Signed-off-by: sfc-gh-npuka <naisila.puka@snowflake.com>
Refactor transaction-level REST catalog request tracking to resolve and
store a RestCatalogConnectionInfo per table. Batch commit requests are
now grouped by REST catalog host, so tables using different catalog
servers within the same transaction are committed to the correct
endpoints independently.

Signed-off-by: sfc-gh-npuka <naisila.puka@snowflake.com>
Comprehensive test suite covering:
- FDW existence and pre-created postgres/object_store servers
- CREATE SERVER with valid and invalid options
- Foreign table creation on handler-less FDW (query-time error)
- ALTER/DROP SERVER operations
- Server-based catalog references in CREATE TABLE
- Backward compatibility with catalog='rest'/'postgres'/'object_store'

Signed-off-by: sfc-gh-npuka <naisila.puka@snowflake.com>
1 - Add extension owned rest catalog of type 'rest'
2 - Disallow creation of further 'postgres' or 'object_store'
    type catalogs unless we are creating_extension
3 - Disallow renaming/dropping extension owned catalogs.
4 - Disallow altering the options of extension-owned postgres
    and object_store catalogs.

Signed-off-by: sfc-gh-npuka <naisila.puka@snowflake.com>
- Add helper functions:
	- IsRestCatalogOwnedByExtension ('rest' name)
	- IsCatalogOwnedByExtension ('postgres', 'object_store', 'rest')
- Rename IsServerBasedRestCatalog to IsRestCatalogOwnedByUsers
- Use servername per server token cache
- Add tests where we modify tables from different catalogs(rest_endpoints)
  in the same transaction.
- Keep some comments I accidentally removed.

Signed-off-by: sfc-gh-npuka <naisila.puka@snowflake.com>
This means any type 'rest' server (extension or user-owned)
will use GUC options as default, and change any option to the ones
set on the server.

This eliminates the need of the function IsRestCatalogOwnedByUsers
Instead we just have IsRestCatalog function.

This also eliminates the need of GetRestCatalogConnectionFromGUCs
Instead we just have GetRestCatalogConnectionFromServer - this
function uses GUC values as defaults for all type 'rest' servers.

Signed-off-by: sfc-gh-npuka <naisila.puka@snowflake.com>
- Block CREATE FOREIGN TABLE on iceberg_catalog servers. The
  iceberg_catalog FDW has no handler, so foreign tables created on it
  would fail at query time with "has no handler". The check is added to
  ErrorUnsupportedCreatePgLakeTableHandler in pg_lake_table, which
  already runs first for all CREATE FOREIGN TABLE statements.

- Block ALTER SERVER ... OWNER TO on extension-owned catalog servers
  (postgres, object_store, rest).

- Move ICEBERG_CATALOG_FDW_NAME from rest_catalog.h to catalog_type.h
  alongside the other catalog name constants, since it is referenced by
  both pg_lake_iceberg and pg_lake_table.

- Rename rest_auth_type value "default" to "oauth2" to better describe
  the standard OAuth2 client_credentials grant with Basic auth. "default"
  value is also kept.

- Rename ProtectExtensionCatalogServersHandler to
  BlockDDLOnExtensionCatalogs for clarity.

- Move BlockDDLOnExtensionCatalogs registration from pg_lake_iceberg
  init to pg_lake_table init, where all other ProcessUtility hooks are
  registered.

Signed-off-by: sfc-gh-npuka <naisila.puka@snowflake.com>
Token cache improvements:
- Allocate the token cache hash table and token strings in a dedicated
  RestTokenCacheCtx memory context (under TopMemoryContext) instead of
  directly in TopMemoryContext, keeping the cache memory isolated.
- On a 419 (token expired) retry, invalidate only the affected server
  cached token instead of clearing the entire cache.  The server name is
  passed through SendRequestToRestCatalog and stored in a file-scoped
  static so the retry callback can target the right entry.

Transaction commit batching fix:
- Group batches by (host, catalogName) instead of host alone.  The
  transaction commit URL includes the catalog prefix, so two servers
  pointing to the same host but with different catalog_name values
  need separate commits.  Previously, all tables on the same host were
  batched together using only the first table catalogName.

Other improvements:
- Replace the ereport(ERROR) FDW name check in
  GetRestCatalogConnectionFromServer with an Assert, since the catalog
  option validator already ensures only iceberg_catalog servers are
  accepted.
- Add errhint to credential/host error messages in
  FetchRestCatalogAccessToken, pointing users to both the server option
  and the corresponding GUC.
- Add test_precreated_rest_server test.

Signed-off-by: sfc-gh-npuka <naisila.puka@snowflake.com>
Block CREATE SERVER with a name that case-insensitively matches the
extension-owned catalog names (postgres, object_store, rest) on the
iceberg_catalog FDW. Also block ALTER SERVER ... RENAME TO a reserved
name.

Fix a latent bug across multiple files where pg_strncasecmp was used
with a partial length, causing prefix-only matching: e.g. "rest_1" would
match "rest". Replaced all instances with pg_strcasecmp for exact
case-insensitive comparison.

Signed-off-by: sfc-gh-npuka <naisila.puka@snowflake.com>
REST catalog options and retry mechanism:
- Rename RestCatalogConnectionInfo to RestCatalogOptions throughout
- Eliminate CurrentRetryServerName static; pass opts directly through
  HttpRetryFn callback (void *context + List *headers) so the 419
  token-expired handler can force-refresh and patch the Authorization
  header in-place
- Fix double-free in GetRestCatalogAccessToken: null out entry fields
  before calling FetchRestCatalogAccessToken

Transaction-scoped state:
- Reject transactions that touch tables from different REST catalog
  servers
- Replace per-table rest catalog opts  deep-copy with a single
  PgLakeXactRestCatalogOpts static, deep-copied into
  TopTransactionContext on first use
- This avoids syscache lookups at XACT_EVENT_COMMIT time, which are
  forbidden (AssertCouldGetRelation fires during TRANS_COMMIT state)

Server configuration enforcement:
- Require TYPE 'rest' on CREATE SERVER ... FOREIGN DATA WRAPPER
  iceberg_catalog (reject NULL or non-rest types)
- Make FDW option names and auth type values case-insensitive
  (pg_strcasecmp), while keeping server names case-sensitive
- Make reserved catalog name checks (postgres, object_store, rest)
  case-insensitive via IsCatalogOwnedByExtension
- Support location_prefix server option, overriding the GUC default
- Accept user-created iceberg_catalog servers in default_catalog GUC

Tests:
- Add test_reject_modify_different_rest_catalogs_in_single_transaction
- Add test_server_location_prefix_overrides_guc
- Add tests for TYPE enforcement, case-sensitive server names, and
  default_catalog GUC with user-created servers
- Remove obsolete no-TYPE-defaults-to-rest tests

Signed-off-by: sfc-gh-npuka <naisila.puka@snowflake.com>
Don't create foreign servers for the reserved catalog names in the
extension install script.  These names ('postgres', 'object_store',
'rest') could already be in use in the target database (especially
'postgres'), causing the extension script to fail.

Instead, treat them as special built-in identifiers whose configuration
comes entirely from GUC settings.  Only user-created catalogs (via
CREATE SERVER ... FOREIGN DATA WRAPPER iceberg_catalog) result in
actual foreign server objects.

Key changes:

- Remove CREATE SERVER postgres/object_store/rest from the upgrade SQL.
- GetRestCatalogOptionsFromCatalog now returns GUC-only opts for the
  built-in 'rest' name without looking up a foreign server.
- Simplify BlockDDLOnExtensionCatalogs: only guard CREATE SERVER
  (reserved names, TYPE postgres/object_store) and RENAME TO reserved
  names.  ALTER/DROP/OWNER on built-in names fail naturally since no
  server object exists.
- Remove IsIcebergCatalogServer (no longer needed).
- Rename RestCatalogOptions.serverName to .catalog and
  GetRestCatalogOptionsFromServer to GetRestCatalogOptionsFromCatalog
  to reflect the new semantics.
- Update tests

Signed-off-by: sfc-gh-npuka <naisila.puka@snowflake.com>
The catalog_name option was accepted on iceberg_catalog servers but
never consumed.  Now GetRestCatalogOptionsFromCatalog reads it and
stores it in RestCatalogOptions.catalogName.

GetRestCatalogName uses a three-level fallback for the REST API
catalog prefix: table option > server option > database name.
This lets a server define a default catalog_name for all its tables
while still allowing per-table overrides.

- Add catalogName field to RestCatalogOptions.
- Populate it from the server catalog_name option in
  GetRestCatalogOptionsFromCatalog.
- Rewrite GetRestCatalogName to check table option first, then
  opts->catalogName.
- Writable REST catalog tables cannot use a server with
  catalog_name set, since for now they inherit from the
  database name, schema name, and table name
- Add tests

Signed-off-by: sfc-gh-npuka <naisila.puka@snowflake.com>
Signed-off-by: sfc-gh-npuka <naisila.puka@snowflake.com>
Consolidate the three per-transaction REST catalog statics
(RestCatalogRequestsHash, PgLakeXactCommitContext,
PgLakeXactRestCatalogOpts) into a single RestCatalogXactState struct
to reduce global state and make lifetime management clearer.

Merge BlockDDLOnExtensionCatalogs and RequireRestTypeForIcebergCatalogServer
into a single ValidateIcebergCatalogServerDDL hook, eliminating duplicate
hook guard logic for CREATE SERVER validation.

Grant USAGE on iceberg_catalog FDW to lake_write so non-superusers
with write permissions can create catalog servers.

Additional cleanups:
- Convert unreachable host check in FetchRestCatalogAccessToken to Assert
- Rename FDW validator parameter from 'catalog' to 'catalogRelId'
- Replace two separate override tests with a single parameterized
  test_server_option_overrides_guc covering rest_endpoint, client_id,
  client_secret, location_prefix, and catalog_name
- Add tests for no-option server creation and lake_write permissions

Signed-off-by: sfc-gh-npuka <naisila.puka@snowflake.com>
Signed-off-by: sfc-gh-npuka <naisila.puka@snowflake.com>
Harden catalog server security, configuration, and cross-catalog safety

Refactor GetRestCatalogOptionsFromCatalog into a clean dispatcher
(ResolveRestCatalogOptions) with two explicit builders:
BuildRestCatalogOptionsFromGUCs for the built-in 'rest' path and
BuildRestCatalogOptionsFromServer for user-created servers, plus shared
helpers ApplyGUCDefaults, ApplyServerOptionOverrides, and
ValidateRestCatalogOptions.  Add CopyRestCatalogOptions for safe
deep-copy into a target memory context.

Unify the three separate server option lists (whitelist array, errhint
string, applier chain) into a single IcebergCatalogOptionDesc descriptor
table with type, offsetof, and validation flags.  Adding or removing an
option is now a one-line change.

Add DDL-time value validation: reject empty strings for client_id,
client_secret, scope, catalog_name; require a URI scheme for
rest_endpoint, oauth_endpoint, location_prefix.

Require ACL_USAGE on user-created iceberg_catalog servers at CREATE TABLE
time, matching core's CREATE FOREIGN TABLE semantics.  Record a
DEPENDENCY_NORMAL from iceberg tables to their catalog server in
pg_depend so DROP SERVER is blocked (and CASCADE drops them).

Block ALTER SERVER RENAME for iceberg_catalog servers since dependent
tables store the server name as a string in ftoptions.

Block ALTER SERVER SET rest_endpoint when dependent writable tables exist
to prevent silently redirecting them to a different REST catalog.

Make GetRestCatalogName always return get_database_name(MyDatabaseId) for
writable tables so ALTER SERVER ADD catalog_name cannot re-route an
existing table to a different namespace.

Fix token cache hash key regression: zero the key buffer with MemSet
before strlcpy in BuildTokenCacheKey.  Add syscache invalidation callback
on FOREIGNSERVEROID to reset the token cache on ALTER/DROP SERVER, using
CacheMemoryContext as parent.  Add NULL guard on opts in
GetRestCatalogAccessToken.

Fix default_catalog GUC check hook to accept values outside a transaction
(ALTER SYSTEM + pg_reload_conf path), mirroring how PostgreSQL handles
check_default_tablespace.

Introduce ValidateXactRestCatalog as a fail-fast guard that checks
cross-catalog DML at statement time rather than at XACT_EVENT_PRE_COMMIT.
Planted in postgresBeginForeignModify and AddQueryResultToTable.  The
existing pre-commit check is retained as a belt-and-suspenders fallback.

Parametrize test_writable_rest_iceberg_table over built-in 'rest' and
user-created server paths.  Add tests for USAGE enforcement, dependency
tracking, server rename blocking, rest_endpoint blocking, catalog_name
re-routing, token cache invalidation, ALTER SYSTEM deferred validation,
option value validation, multi-table same-server transactions, and
cross-catalog rejection cleanup.

Signed-off-by: sfc-gh-npuka <naisila.puka@snowflake.com>
…ursor's hands

Signed-off-by: sfc-gh-npuka <naisila.puka@snowflake.com>
Signed-off-by: sfc-gh-npuka <naisila.puka@snowflake.com>
Fail fast at statement time on cross-catalog DML. Rename
ValidateXactRestCatalog to BindRelationToXactRestCatalog and call it from
both FDW write paths (postgresBeginForeignModify and AddQueryResultToTable).
The function now binds the transaction to the relation's REST catalog on
the first DML and rejects any subsequent statement targeting a different
catalog, so the second INSERT errors out before any Parquet is written.
The pre-commit hook is kept as a belt-and-suspenders fallback for DDL
paths that reach the tracker without going through the new guard. The
regression test asserts the second INSERT (not COMMIT) raises.

Move ValidateIcebergCatalogServerDDL registration from pg_lake_table to
pg_lake_iceberg, where the handler is actually defined. Architecturally
this puts ownership of catalog-server DDL validation in the extension
that owns the catalog server abstraction.

Fix latent dangling-pointer in GetValidCatalogOptionsHint. The static
hint cache was allocated via initStringInfo in whatever short-lived
context the validator happened to be running in (typically MessageContext),
so the second invalid-option failure in the same backend returned freed
memory to errhint. Allocate in TopMemoryContext so the cache survives
transaction boundaries. The strengthened test_reject_unknown_server_option
now issues two failing CREATE SERVER statements on the same connection and
asserts the full option list appears in the hint on both attempts; the
previous version would not have caught this bug.

Also covers earlier review-driven hardening on the same branch: token
cache invalidation on ALTER SERVER credentials, ALTER SERVER rest_endpoint
blocking for all dependent REST iceberg tables (writable and read-only),
and DROP OWNED BY / concurrent DROP SERVER dependency tests.

Co-authored-by: Cursor <cursoragent@cursor.com>
Signed-off-by: sfc-gh-npuka <naisila.puka@snowflake.com>
The three reserved short catalog names ('postgres', 'object_store',
'rest') previously had no backing pg_foreign_server row, which split
the resolution path in two (BuildRestCatalogOptionsFromGUCs for the
built-in REST, BuildRestCatalogOptionsFromServer for user-created
ones) and prevented us from recording pg_depend edges for tables that
use the short names.

Pre-create three iceberg_catalog servers at extension upgrade time:
pg_lake_postgres_catalog, pg_lake_object_store_catalog, and
pg_lake_rest_catalog.  The prefixed long names cannot collide with
names users may already have, notably the very common
'CREATE SERVER postgres FOREIGN DATA WRAPPER postgres_fdw'.

User-facing DDL stays identical: users keep writing
WITH (catalog='rest' / 'postgres' / 'object_store').  A new
ResolveCatalogServerName helper maps short -> long at server lookup
time, and opts->catalog continues to carry the short user-facing name
so error messages, the cross-catalog DML guard, and the token cache
key never expose the long names.

The built-in servers are pure structural anchors: ALTER OPTIONS,
ALTER OWNER, RENAME, and DROP are all blocked.  Configuration for
the built-in REST catalog continues to live in GUCs.  The resolution
path collapses to a single BuildRestCatalogOptionsFromServer that
applies GUC defaults first and then any server options (always empty
for the built-ins), so the built-in REST and user-created REST go
through one code path.

The long names are rejected as user-facing catalog= values by
IsRestCatalog, the default_catalog GUC check hook, and create_table.c
with a clear hint pointing to the short form.

Tables created with the short reserved names now record a pg_depend
edge against the corresponding built-in server, so DROP EXTENSION
CASCADE cleans them up via the standard dependency walker.

ValidateIcebergCatalogServerDDL now also rejects:
  - CREATE SERVER with one of the internal long names
  - ALTER SERVER ... RENAME TO one of the internal long names
  - ALTER SERVER OPTIONS on any built-in server (immutable anchors)
  - ALTER SERVER ... OWNER TO on any built-in server

Two small static helpers (RejectIfBuiltinCatalogServerName and
RejectIfModifyingBuiltinCatalogServer) collapse the repeating
"is built-in" + ereport patterns into single calls.

Upgrade safety: the --3.3--3.4 script does a pre-flight check that
errors with a clear hint if any of the three long names already
exists in the target database, so ALTER EXTENSION UPDATE fails
loudly rather than partway through.

Tests: 23 new cases in test_iceberg_catalog_server.py covering the
three servers exist and are extension-owned, CREATE / RENAME-TO /
ALTER OPTIONS / ALTER OWNER / DROP on the long names all blocked,
catalog= with a long name rejected at CREATE TABLE, the pg_depend
edge for the built-in postgres catalog, and -- the original collision
concern -- a pre-existing postgres_fdw server literally named
'postgres' coexists with pg_lake_postgres_catalog.

Co-authored-by: Cursor <cursoragent@cursor.com>
Signed-off-by: sfc-gh-npuka <naisila.puka@snowflake.com>
Now that every REST catalog -- built-in and user-created -- is backed by
a real iceberg_catalog foreign server, the server's OID is a stable,
unambiguous identifier.  This commit pivots the in-memory machinery off
case-insensitive name comparison and onto OID equality.

* RestCatalogOptions gains a leading `Oid serverOid` field, populated in
  BuildRestCatalogOptionsFromServer from server->serverid and propagated
  by CopyRestCatalogOptions.  The struct comment is updated to call out
  that `catalog` is now purely a user-facing label kept around for error
  messages, and that `serverOid` is the canonical identity.

* The per-catalog OAuth token cache is rekeyed from a NAMEDATALEN char
  buffer to sizeof(Oid).  BuildTokenCacheKey is removed; the lookup site
  now passes &opts->serverOid directly, with a defensive
  Assert(OidIsValid(opts->serverOid)) so a future caller that forgets to
  resolve options cannot silently funnel every catalog into the same
  cache slot.  InvalidateRestTokenCache keeps its existing full-flush
  behavior on any pg_foreign_server change; targeted invalidation is
  not worth the per-entry bookkeeping at the rate ALTER SERVER actually
  happens.

* The cross-transaction "same REST catalog throughout" guard in
  BindRelationToXactRestCatalog and the belt-and-suspenders branch of
  RecordRestCatalogRequestInTx now compare serverOid directly instead
  of pg_strcasecmp'ing catalog names.  The two user-facing names are
  still surfaced in errdetail so the message remains in the terms the
  user typed.  This also closes the corner case where the same physical
  server is referenced via different casings in successive statements:
  they now collapse to the same OID and are treated as one catalog.

Co-authored-by: Cursor <cursoragent@cursor.com>
Signed-off-by: sfc-gh-npuka <naisila.puka@snowflake.com>
Close two DDL protection gaps in ValidateIcebergCatalogServerDDL:

- Block ALTER EXTENSION pg_lake_iceberg DROP SERVER/FOREIGN DATA WRAPPER
  for iceberg_catalog objects, preventing superuser detachment of the
  DEPENDENCY_EXTENSION edge that shields them from standalone DROP.
- Block ALTER FOREIGN DATA WRAPPER iceberg_catalog entirely, preventing
  replacement of the validator or addition of a handler.

Both require superuser and are defense-in-depth against accidental
misuse that would silently weaken the protection model.

Add BindRelationToXactRestCatalog call to postgresExecForeignTruncate
so TRUNCATE gets the same early, friendly cross-catalog mismatch error
as INSERT/UPDATE/DELETE, instead of deferring to the less informative
RecordRestCatalogRequestInTx fallback at pre-commit time.

Also includes small hardening and cleanup:
- Guard catalog/host pstrdup in CopyRestCatalogOptions against NULL
  for consistency with all other string fields.
- Add cross-reference comments on the two RecordIcebergCatalogServer-
  Dependency call sites in create_table.c.
- Require postgres_fdw in coexistence test instead of silently skipping.
- Update test_reject_options_on_non_server to match new ALTER FDW error.

Co-authored-by: Cursor <cursoragent@cursor.com>
Signed-off-by: sfc-gh-npuka <naisila.puka@snowflake.com>
Signed-off-by: sfc-gh-npuka <naisila.puka@snowflake.com>
- Simplify GetValidCatalogOptionsHint: drop the unnecessary static +
  TopMemoryContext caching pattern; the string is only used on error
  paths where ereport copies it into ErrorContext before aborting, so
  a plain CurrentMemoryContext allocation is correct and sufficient.

- Extract EnsureXactBoundToRestCatalog helper: unify the duplicated
  "resolve catalog options, bind-or-check server OID" logic from
  BindRelationToXactRestCatalog and RecordRestCatalogRequestInTx into
  a single static helper, eliminating ~60 lines of duplication.

- Assert single catalog name in commit path: replace the silent
  "last one wins" catalogName assignment in PostAllRestCatalogRequests
  with a capture-first + Assert-subsequent pattern, turning the TODO
  into an explicit invariant check.

- Fix rest_auth_type error hint: include the missing "default" value
  and reorder to "default", "oauth2", "horizon" (most common first).

Signed-off-by: sfc-gh-npuka <naisila.puka@snowflake.com>
@sfc-gh-npuka sfc-gh-npuka force-pushed the naisila/catalog_reconfig branch from 76408f8 to 138facc Compare June 4, 2026 18:34
@sfc-gh-npuka sfc-gh-npuka merged commit 076ee66 into main Jun 4, 2026
125 of 127 checks passed
@sfc-gh-npuka sfc-gh-npuka deleted the naisila/catalog_reconfig branch June 4, 2026 19:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

New Catalog Configuration

4 participants